Optimize IP field parsing #132463

felixbarny · 2025-08-05T16:39:34Z

Optimizes IP field parsing in the following ways:

Leverages XContentParser#optimizedTextOrNull to avoid the UTF-8 to Java String conversion overhead.
Avoids the expensive ipString.split(":") in favor of a more efficient algorithm that iterates over all character bytes only once.
Reduces memory allocations by avoiding a roundtrip through InetAddress. This requires creating an ESInetAddressPoint class that's similar to Lucene's InetAddressPoint as the latter can only be constructed via an InetAddress.
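For illustration, the single-pass idea from the list above can be sketched for the IPv4 case as follows. This is not the code from this PR: the class name `Ipv4SinglePassSketch`, its method names, and the choice to reject leading zeros are assumptions made for the example. The second method shows the 16-byte IPv4-mapped IPv6 layout that Lucene's InetAddressPoint uses for IPv4 values, built directly from the parsed bytes without going through an InetAddress.

```java
/** Hypothetical sketch, not the PR's implementation. */
final class Ipv4SinglePassSketch {

    private Ipv4SinglePassSketch() {}

    /**
     * Parses a dotted quad ("a.b.c.d") from UTF-8 bytes in a single pass,
     * without creating a String or calling split().
     * Returns the 4 address bytes, or null if the input is not a valid dotted quad.
     */
    static byte[] parseIpv4(byte[] utf8, int offset, int length) {
        byte[] result = new byte[4];
        int octetIndex = 0; // which of the four octets is being read
        int value = 0;      // numeric value of the current octet
        int digits = 0;     // number of digits seen in the current octet
        for (int i = offset; i < offset + length; i++) {
            byte b = utf8[i];
            if (b >= '0' && b <= '9') {
                // Assumed policy for this sketch: reject leading zeros such as "01".
                if (digits == 1 && value == 0) {
                    return null;
                }
                value = value * 10 + (b - '0');
                digits++;
                if (value > 255) {
                    return null;
                }
            } else if (b == '.') {
                // An empty octet or a fourth dot makes the input invalid.
                if (digits == 0 || octetIndex == 3) {
                    return null;
                }
                result[octetIndex++] = (byte) value;
                value = 0;
                digits = 0;
            } else {
                return null;
            }
        }
        if (octetIndex != 3 || digits == 0) {
            return null;
        }
        result[3] = (byte) value;
        return result;
    }

    /** Embeds 4 IPv4 bytes into the 16-byte IPv4-mapped IPv6 form (::ffff:a.b.c.d). */
    static byte[] toIpv4Mapped(byte[] ipv4) {
        byte[] mapped = new byte[16]; // bytes 0..9 stay zero
        mapped[10] = (byte) 0xff;
        mapped[11] = (byte) 0xff;
        System.arraycopy(ipv4, 0, mapped, 12, 4);
        return mapped;
    }
}
```

A caller that already holds the UTF-8 bytes of a field value would pass them with their offset and length, and index the 16-byte result directly.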

The semantics are kept intact, and all IP parsing-related tests still pass.

This could potentially hurt the performance of code paths that don't have access to a UTF-8 encoded byte array (but have a String), as the String will now need to be converted to a byte array first. However, at first glance these code paths don't seem to be as performance-sensitive.
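A minimal sketch of that trade-off, using a hypothetical class `IpTextEntryPoints` that is not part of this PR: the core routine works on UTF-8 bytes, so callers that only hold a String pay for one extra conversion before reaching it.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of a bytes-first API with a String convenience overload. */
final class IpTextEntryPoints {

    private IpTextEntryPoints() {}

    /** Core routine on UTF-8 bytes: true if the text contains ':' and so cannot be a dotted quad. */
    static boolean looksLikeIpv6(byte[] utf8, int offset, int length) {
        for (int i = offset; i < offset + length; i++) {
            if (utf8[i] == ':') {
                return true;
            }
        }
        return false;
    }

    /** String callers allocate and fill a byte[] once, then reuse the bytes-based routine. */
    static boolean looksLikeIpv6(String ip) {
        byte[] utf8 = ip.getBytes(StandardCharsets.UTF_8);
        return looksLikeIpv6(utf8, 0, utf8.length);
    }
}
```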

martijnvg · 2025-08-11T06:41:24Z

This looks like a nice improvement. Do you have an idea how this impacts indexing performance? For example, by running metricsgenreceiver or a test workload with just IPs?

elasticsearchmachine · 2025-08-11T09:07:17Z

Pinging @elastic/es-storage-engine (Team:StorageEngine)

elasticsearchmachine · 2025-08-11T09:07:17Z

Pinging @elastic/es-search-foundations (Team:Search Foundations)

felixbarny · 2025-08-11T09:07:58Z

Based on profiles I captured for 30s of a run of metricsgenreceiver before and after the optimizations, there are about 60% fewer samples for InetAddresses.forString(String) with this optimization (588 vs 1433 samples). In addition to that, optimizedTextOrNull is a little more efficient than textOrNull (229 vs 257 samples). With CBOR (after #132542), this gets significantly faster (115 samples). It's difficult to compare the full IpFieldMapper#parseCreateField time before and after because the run with the optimizations also included #132566.

romseygeek

LGTM, nice improvement.

Just to double check that I'm reading it correctly: for parsers that don't support optimizedText(), this will still end up doing just a single String-to-bytes conversion in the parser itself, right?

server/src/main/java/org/elasticsearch/index/mapper/ESInetAddressPoint.java

rjernst

I only reviewed the changes to InetAddresses.

server/src/main/java/org/elasticsearch/common/network/InetAddresses.java

ldematte

I think this PR would greatly benefit from some JMH benchmarks to show the differences and guide some development choices (e.g. specialized functions for String vs byte[])

server/src/main/java/org/elasticsearch/common/network/InetAddresses.java

felixbarny · 2025-08-19T07:59:38Z

I think this PR would greatly benefit from some JMH benchmarks to show the differences and guide some development choices (e.g. specialized functions for String vs byte[])

The approach I took here was to conduct metric ingestion benchmarks and analyze CPU and allocation flame graphs to get a better understanding of the real-world impact outside of narrow microbenchmarks.

felixbarny · 2025-08-19T09:59:26Z

Sorry, I think I misinterpreted your suggestion. It makes sense to compare the performance of the String-based methods before and after this change to see what the difference is. It may actually not be a regression because of the other optimizations. Let's see.

felixbarny · 2025-08-19T10:18:11Z

I've added benchmarks and compared the before and after.

To summarize, the throughput is higher in all scenarios after the changes proposed in this PR. The only regression is that there are more allocations when handling IPv4 addresses.

                                                             Before (main)           After (this PR)
Benchmark                                (size)   Mode  Cnt       Score      Error       Score      Error   Units
encodeAsIpv6WithIpv4                       1000  thrpt    3                          25912.287 ±  425.865   ops/s
encodeAsIpv6WithIpv4:gc.alloc.rate.norm    1000  thrpt    3                          32000.027 ±    0.001    B/op
encodeAsIpv6WithIpv6                       1000  thrpt    3                           6748.165 ± 1197.371   ops/s
encodeAsIpv6WithIpv6:gc.alloc.rate.norm    1000  thrpt    3                          32000.104 ±    0.017    B/op
forStringIpv4Bytes                         1000  thrpt    3                          22505.172 ±  306.497   ops/s
forStringIpv4Bytes:gc.alloc.rate.norm      1000  thrpt    3                          80000.031 ±    0.001    B/op
forStringIpv6Bytes                         1000  thrpt    3                           6190.543 ± 1989.384   ops/s
forStringIpv6Bytes:gc.alloc.rate.norm      1000  thrpt    3                         152000.113 ±    0.037    B/op
forStringIpv4String                        1000  thrpt    3    18724.122 ± 345.067   22477.031 ±  209.926   ops/s
forStringIpv4String:gc.alloc.rate.norm     1000  thrpt    3    80000.037 ±   0.002  111992.031 ±    0.002    B/op
forStringIpv6String                        1000  thrpt    3     3356.420 ±  93.202    5582.589 ± 3434.972   ops/s
forStringIpv6String:gc.alloc.rate.norm     1000  thrpt    3   696000.209 ±   0.035  208000.125 ±    0.079    B/op
getIpOrHostIpv4                            1000  thrpt    3    18902.280 ± 352.800   22516.581 ±  476.120   ops/s
getIpOrHostIpv4:gc.alloc.rate.norm         1000  thrpt    3    80000.037 ±   0.001  112000.031 ±    0.001    B/op
getIpOrHostIpv6                            1000  thrpt    3     2056.111 ±  11.173    3186.665 ±  106.916   ops/s
getIpOrHostIpv6:gc.alloc.rate.norm         1000  thrpt    3  1104000.343 ±   0.049  616000.221 ±    0.035    B/op
isInetAddressIpv4                          1000  thrpt    3    25479.685 ±  66.369   25844.380 ±  178.405   ops/s
isInetAddressIpv4:gc.alloc.rate.norm       1000  thrpt    3    24000.027 ±   0.001   56000.027 ±    0.001    B/op
isInetAddressIpv6                          1000  thrpt    3     3513.745 ± 965.823    6981.567 ± 1365.645   ops/s
isInetAddressIpv6:gc.alloc.rate.norm       1000  thrpt    3   576000.200 ±   0.074   88000.100 ±    0.019    B/op
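
To make the table easier to read, the relative change between the two columns can be computed as below. The class `BenchmarkDelta` is a hypothetical helper written for this illustration; the numbers it is applied to should be taken from the table above.

```java
import java.util.Locale;

/** Hypothetical helper for reading before/after benchmark scores. */
final class BenchmarkDelta {

    private BenchmarkDelta() {}

    /** Relative change from before to after, in percent (positive means the score went up). */
    static double percentChange(double before, double after) {
        return (after - before) / before * 100.0;
    }

    /** Formats the change with an explicit sign and one decimal place. */
    static String describe(String name, double before, double after) {
        return String.format(Locale.ROOT, "%s: %+.1f%%", name, percentChange(before, after));
    }
}
```

For example, `percentChange(18724.122, 22477.031)` gives roughly +20% throughput for forStringIpv4String, while its allocation rate (80000.037 to 111992.031 B/op) rises by roughly 40%. For forStringIpv6String, throughput rises by roughly 66% and allocations drop by roughly 70%. With only 3 iterations per benchmark and error margins as large as ± 3434.972 ops/s, the throughput figures are indicative rather than precise.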

rjernst

LGTM

ldematte · 2025-08-20T06:22:09Z

I've added benchmarks and compared the before and after.

Thanks! Looks very good indeed!

Optimize IP field parsing

17eb2ae

felixbarny added the >non-issue, :Search Foundations/Mapping, and :StorageEngine/Mapping labels Aug 5, 2025

elasticsearchmachine added the v9.2.0 and external-contributor labels Aug 5, 2025

felixbarny and others added 3 commits August 5, 2025 19:44

Fix bug in convertDottedQuadToHex

55bb8d6

Small performance improvements for ipv4 parsing

31a6385

Merge branch 'main' into ip-parsing-optimization

11aafe4

felixbarny marked this pull request as ready for review August 11, 2025 09:06

felixbarny requested a review from a team as a code owner August 11, 2025 09:06

elasticsearchmachine added the Team:StorageEngine and Team:Search Foundations labels Aug 11, 2025

felixbarny and others added 8 commits August 12, 2025 16:12

Reduce memory allocations by avoiding InetAddress

4c95f1f

Merge remote-tracking branch 'origin/main' into ip-parsing-optimization

725fae9

Avoid lambda overhead

02bc756

Avoid forbidden APIs

594bf80

Merge branch 'main' into ip-parsing-optimization

58c2611

Fix tests that expected an InetAddressPoint

0c16cab

More allocation optimizations for parsing ip4v addresses

e626e8f

Merge remote-tracking branch 'origin/main' into ip-parsing-optimization

f90f7f4

felixbarny requested a review from romseygeek August 15, 2025 06:53

romseygeek approved these changes Aug 15, 2025

felixbarny mentioned this pull request Aug 18, 2025

Add OTLP metrics endpoint #133057

Closed

felixbarny added 2 commits August 18, 2025 09:03

Merge remote-tracking branch 'origin/main' into ip-parsing-optimization

3cb3979

Address comments from review

c4c48f4

rjernst requested changes Aug 18, 2025

Merge remote-tracking branch 'origin/main' into ip-parsing-optimization

bfe7c62

ldematte reviewed Aug 19, 2025

felixbarny added 2 commits August 19, 2025 08:51

Address review comments

349792d

Remove remaining use of Text

f201471

felixbarny added 2 commits August 19, 2025 12:13

Add benchmarks

90eacee

Merge remote-tracking branch 'origin/main' into ip-parsing-optimization

88afe92

felixbarny and others added 4 commits August 19, 2025 14:21

Simplify and optimize quad to hex conversion

e2fc667

Merge branch 'main' into ip-parsing-optimization

93e0691

[CI] Auto commit changes from spotless

61ecab5

Use root locale for String.format

2b1d9d1

rjernst approved these changes Aug 19, 2025

felixbarny mentioned this pull request Aug 19, 2025

TSDB ingest performance: combine routing and tsdb hashing #132566

Merged

3 tasks

felixbarny merged commit 6dae011 into elastic:main Aug 19, 2025
34 checks passed

felixbarny deleted the ip-parsing-optimization branch August 19, 2025 15:16

felixbarny mentioned this pull request Aug 20, 2025

Fix offset handling in ipv6 parsing #133192

Merged

felixbarny self-assigned this Aug 25, 2025
